
Remove the single process, multi-gpu feature. #1936

Merged: 9 commits into trunk-major on Nov 14, 2024
Conversation

@joaander (Member) commented Nov 13, 2024

Description

Remove the single process, multi-gpu feature.

Using multiple GPUs via MPI domain decomposition is still supported.

Motivation and context

Resolves #1664

How has this been tested?

CI checks.

Checklist:

  • I have reviewed the Contributor Guidelines.
  • I agree with the terms of the HOOMD-blue Contributor Agreement.
  • My name is on the list of contributors (sphinx-doc/credits.rst) in the pull request source branch.
  • I have summarized these changes in CHANGELOG.rst following the established format.

@mphoward (Collaborator) commented:
@joaander I saw this PR open up (great!) and had a quick question: what does removing this code path mean for GlobalArray vs. GPUArray for developers in current & future code? I remember the GlobalArray pattern being rather complicated (CRTP) but in the end, I think degrading to essentially just a GPUArray if there is only 1 GPU per process?

> GlobalArray<> supports all functionality that GPUArray<> does, and should eventually replace GPUArray. In fact, for performance considerations in single GPU situations, GlobalArray internally falls back on GPUArray (and whenever it doesn't have an ExecutionConfiguration). This behavior is controlled by the result of ExecutionConfiguration::allConcurrentManagedAccess().

@joaander (Member, Author) commented:
I am planning to leave GlobalArray in place for now. A future PR might remove it (and GlobalVector). Care must be taken in cases where force_managed is set to true:

```cpp
                bool force_managed = false)
    : m_exec_conf(exec_conf),
#ifndef ALWAYS_USE_MANAGED_MEMORY
      // explicit copy should be elided
      m_fallback((exec_conf->allConcurrentManagedAccess()
                  || (force_managed && exec_conf->isCUDAEnabled()))
                     ? GPUArray<T>()
                     : GPUArray<T>(num_elements, exec_conf)),
#endif
      m_num_elements(num_elements), m_pitch(num_elements), m_height(1), m_acquired(false),
      m_tag(tag), m_align_bytes(0),
      m_is_managed(exec_conf->allConcurrentManagedAccess()
                   || (force_managed && exec_conf->isCUDAEnabled()))
```

These cases will need to maintain the use of managed memory.

In all cases where force_managed = false (the default), GlobalArray is now equivalent to GPUArray. Future code should continue to use GPUArray in all cases where managed memory is not necessary.
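To make the constructor's branch above concrete, here is a minimal self-contained sketch of the condition that decides whether the array keeps managed memory or falls back to plain GPUArray. The ExecConf struct and uses_managed_memory helper are hypothetical stand-ins for illustration only, not the real HOOMD-blue API (the real classes live in ExecutionConfiguration and GlobalArray):

```cpp
#include <cassert>

// Hypothetical stand-in for ExecutionConfiguration, for illustration only.
struct ExecConf
    {
    bool cuda_enabled = false;
    bool concurrent_managed_access = false;
    bool allConcurrentManagedAccess() const { return concurrent_managed_access; }
    bool isCUDAEnabled() const { return cuda_enabled; }
    };

// Mirrors the decision quoted from the GlobalArray constructor: managed memory
// is used only when the device supports concurrent managed access, or when the
// caller forces it on a CUDA-enabled configuration.
bool uses_managed_memory(const ExecConf& conf, bool force_managed)
    {
    return conf.allConcurrentManagedAccess()
           || (force_managed && conf.isCUDAEnabled());
    }
```

Under this sketch, with force_managed left at its default of false, only the concurrent-managed-access capability of the device matters, which is why GlobalArray degrades to GPUArray on configurations without it.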

@joaander
Copy link
Member Author

It would appear that force_managed is never set to true. In that case, the GlobalArray / GlobalVector removal may come soon, as it is primarily a global search-and-replace operation.

@joaander joaander added the validate (Execute long running validation tests on pull requests) and release (Build and unit test all supported compiler/python configurations) labels Nov 13, 2024
@joaander joaander marked this pull request as ready for review November 14, 2024 11:08
@joaander joaander merged commit 7c8ceac into trunk-major Nov 14, 2024
66 checks passed
@joaander joaander deleted the remove-multigpu branch November 14, 2024 11:08